This report aims to investigate how physical activity impacts study habits.
Key findings:
The more time spent performing physical activity, the longer respondents were able to undergo focused study.
The way individuals perform physical activity has strong impacts on how energised they were before study.
Data was collected using (Google Forms)[https://docs.google.com/forms/d/e/1FAIpQLSdoKyCJjY2eqYijesKuMEWtXnKf2gN7jZkD40PZ7peMhT8kPw/viewform]
data <- read.csv("data.csv")
# loading packages
library(dplyr)
library(plotly)
# show unclean data
str(data)
## 'data.frame': 37 obs. of 10 variables:
## $ X1..Do.you.partake.in.physical.activity. : chr "Yes" "Yes" "Yes" "Yes" ...
## $ X2..How.many.days.per.week.do.you.partake.in.physical.activity. : int 4 3 6 4 2 4 1 1 2 1 ...
## $ X3..How.many.hours.of.physical.activity.do.you.partake.in.each.week.: int 8 3 9 4 3 10 2 3 3 1 ...
## $ X4..How.would.you.describe.the.intensity.of.your.physical.activity. : chr "High intensity" "Low intensity" "Moderate intensity" "Moderate intensity" ...
## $ X5..What.type.of.physical.activity.do.you.partake.in. : chr "Resistance training" "Incidental activity" "Cardiovascular training" "Cardiovascular training" ...
## $ X6..Do.you.tend.to.study.before.or.after.physical.activity. : chr "Before" "After" "After" "Before" ...
## $ X7..Do.you.generally.feel.energised.and.refreshed.before.study. : chr "Usually" "Sometimes" "Usually" "Usually" ...
## $ X8..How.many.hours.of.study.do.you.undergo.per.week. : int 25 30 25 20 10 30 15 60 50 45 ...
## $ X9..How.many.hours.of.focused.study.can.you.undergo.in.one.sitting. : chr "2-3 hours" "1-2 hours" "1-2 hours" "3-4 hours" ...
## $ X : logi NA NA NA NA NA NA ...
# incorrect data
data$X <- NULL
# rename variables for ease of use
data <- rename(data, "physical_activity_status" = "X1..Do.you.partake.in.physical.activity.")
data <- rename(data, "physical_activity_days_per_week" = "X2..How.many.days.per.week.do.you.partake.in.physical.activity.")
data <- rename(data, "physical_activity_hours_per_week" = "X3..How.many.hours.of.physical.activity.do.you.partake.in.each.week.")
data <- rename(data, "intensity" = "X4..How.would.you.describe.the.intensity.of.your.physical.activity.")
data <- rename(data, "type" = "X5..What.type.of.physical.activity.do.you.partake.in.")
data <- rename(data, "study_before_after" = "X6..Do.you.tend.to.study.before.or.after.physical.activity.")
data <- rename(data, "energised_refreshed_status" = "X7..Do.you.generally.feel.energised.and.refreshed.before.study.")
data <- rename(data, "study_hours_per_week" = "X8..How.many.hours.of.study.do.you.undergo.per.week.")
data <- rename(data, "focused_study" = "X9..How.many.hours.of.focused.study.can.you.undergo.in.one.sitting.")
# chr -> factor for ease of use
data$physical_activity_status <- as.factor(data$physical_activity_status)
data$intensity <- as.factor(data$intensity)
data$type <- as.factor(data$type)
data$study_before_after <- as.factor(data$study_before_after)
data$energised_refreshed_status <- as.factor(data$energised_refreshed_status)
data$focused_study <- as.factor(data$focused_study)
# show cleaned data
str(data)
## 'data.frame': 37 obs. of 9 variables:
## $ physical_activity_status : Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ physical_activity_days_per_week : int 4 3 6 4 2 4 1 1 2 1 ...
## $ physical_activity_hours_per_week: int 8 3 9 4 3 10 2 3 3 1 ...
## $ intensity : Factor w/ 4 levels "High intensity",..: 1 3 4 4 1 4 3 3 3 3 ...
## $ type : Factor w/ 4 levels "Cardiovascular training",..: 4 3 1 1 3 4 2 3 1 3 ...
## $ study_before_after : Factor w/ 2 levels "After","Before": 2 1 1 2 2 1 2 1 2 1 ...
## $ energised_refreshed_status : Factor w/ 5 levels "Always","Never",..: 5 4 5 5 4 5 4 5 2 4 ...
## $ study_hours_per_week : int 25 30 25 20 10 30 15 60 50 45 ...
## $ focused_study : Factor w/ 5 levels "< 1 hours","> 4 hours",..: 4 3 3 5 3 2 3 3 3 3 ...
Using the survey, qualitative and quantitative data was collected through the use of multiple-choice and short-answer questions. There are 9 variables and 37 survey responses.
Selection bias: The majority of respondents are students at USYD, and many show different survey responses to those who are not university students in USYD.
Consent Bias: The mode of data collection (survey) inherently introduces consent bias because it gives respondents the choice on whether or not to participate.
Does the amount of time spent training influence the amount and quality of study?
# physical_activity_days_per_week vs frequency barplot
plot1 <- ggplot(data=data,
aes(x=physical_activity_days_per_week)) +
geom_bar() +
xlab("Number of days per week physical activity is done") +
ylab("Frequency") +
theme_bw() +
scale_x_continuous(breaks=c(1, 2, 3, 4, 5, 6, 7))
# physical_activity_hours_per_week vs study_hours_per_week on scatter
plot2 <- ggplot(data=data,
aes(x=physical_activity_hours_per_week,
y=study_hours_per_week)) +
geom_point() +
xlab("Number of hours per week physical activity is done") +
ylab("Number of hours per week study is done") +
theme_bw() +
geom_smooth(method="lm")
cor(data$physical_activity_hours_per_week, data$study_hours_per_week)
## [1] -0.2769296
# physical_activity_days_per_week vs study_hours_per_week on scatter
plot3 <- ggplot(data=data,
aes(x=physical_activity_days_per_week,
y=study_hours_per_week)) +
geom_point() +
xlab("Number of days per week physical activity is done") +
ylab("Number of hours per week study is done") +
theme_bw() +
geom_smooth(method="lm")
cor(data$physical_activity_days_per_week, data$study_hours_per_week)
## [1] -0.3231418
# physical_activity_days_per_week vs focused_study barplot
plot4 <- ggplot(data=data,
aes(x=physical_activity_days_per_week,
fill=focused_study)) +
geom_bar() +
xlab("Number of days per week physical activity is done") +
ylab("Frequency") +
theme_bw() +
scale_x_continuous(breaks=c(1, 2, 3, 4, 5, 6, 7))
# focused_study vs physical_activity_hours_per_week barplot
plot5 <- ggplot(data=data,
aes(x=focused_study,
y=physical_activity_hours_per_week)) +
geom_boxplot() +
xlab("Number of hours of focused study completed in one sitting") +
ylab("Number of hours per week physical activity is done") +
theme_bw() +
scale_x_discrete(limits=c("< 1 hours", "1-2 hours", "2-3 hours", "3-4 hours", "> 4 hours"))
plotly::ggplotly(plot1)
plotly::ggplotly(plot2)
plotly::ggplotly(plot3)
plotly::ggplotly(plot4)
plotly::ggplotly(plot5)
Respondents who performed physical activity a greater number of hours per week spent fewer hours per week studying (plot2). This trend was reflected when observing how many days per week physical activity was performed (plot3). A possible explanation for this could be the time commitment of physical activity, but it is difficult to be certain without further investigation.
However, respondents who performed physical activity a greater number of days per week were able to undergo longer periods of focused study in one sitting compared to respondents who performed physical activity a fewer number of days per week (plot4). This trend was reflected when observing how many hours per week of physical activity were performed (plot5). Research shows a strong correlation between physical activity and concentration.
Overall, there was an inverse effect when observing how the number of days per week physical activity is performed impacts; the number of hours per week study, and the number of hours of focused study in one sitting.
model <- lm(data$study_hours_per_week ~ data$physical_activity_hours_per_week)
summary(model)
##
## Call:
## lm(formula = data$study_hours_per_week ~ data$physical_activity_hours_per_week)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.773 -4.634 0.366 6.401 26.227
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.8777 3.5975 10.251 4.41e-12 ***
## data$physical_activity_hours_per_week -1.0348 0.6069 -1.705 0.0971 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.59 on 35 degrees of freedom
## Multiple R-squared: 0.07669, Adjusted R-squared: 0.05031
## F-statistic: 2.907 on 1 and 35 DF, p-value: 0.09705
residual_plot <- ggplot(data=model,
aes(x=.fitted,
y=.resid)) +
geom_point() +
geom_hline(yintercept=0, linetype="dashed", colour="red") +
ggtitle("Residual Plot") +
xlab("Time spent studying per week (hours)") +
ylab("Residuals") +
theme_bw()
plotly::ggplotly(residual_plot)
## Warning: `gather_()` was deprecated in tidyr 1.2.0.
## ℹ Please use `gather()` instead.
## ℹ The deprecated feature was likely used in the plotly package.
## Please report the issue at <]8;;https://github.com/plotly/plotly.R/issueshttps://github.com/plotly/plotly.R/issues]8;;>.
How does the way an individual trains influence how refreshed they are before study?
# type barplot with evergised_refreshed_status fill
plot6 <- ggplot(data=data,
aes(x=type,
fill=energised_refreshed_status)) +
geom_bar() +
xlab("Type of physical Activity") +
ylab("Frequency") +
theme_bw()
plotly::ggplotly(plot6) # more useful plot
Respondents who performed physical activity of the types: resistance training or flexibility training were more energetic before study.
Respondents who performed physical activity of the type: incidental activity were less energetic before study.
Respondents who performed physical activity of the type: cardiovascular training showed minimal impact on how energetic they were before study.
# intensity barplot with energised_refreshed_status fill
plot7 <- ggplot(data=data,
aes(x=intensity,
fill=energised_refreshed_status)) +
geom_bar() +
xlab("Intensity of physical activity") +
ylab("Frequeny") +
theme_bw() +
scale_x_discrete(limits=c("Low intensity", "Moderate intensity", "High intensity", "High performance athlete intensity"))
plotly::ggplotly(plot7)
The higher the intensity of physical activity, the more energetic respondents were before study.
A limitation is that there is less data on respondents who performed physical activity at a high performance athlete intensity - this can be partially explained due to the fact that there are less people who train at such a high level compared to people who train at a lower intensity. Thus, it is difficult to provide trends when observing respondents who perform at a high performance athlete intensity.
Overall, respondents who performed physical activity at higher intensities showed higher levels of energy before study, but further investigation is required.
# study_before_after barplot with energised_refreshed_status fill
plot8 <- ggplot(data=data,
aes(x=study_before_after,
fill=energised_refreshed_status)) +
geom_bar() +
xlab("Study before or after physical activtiy") +
ylab("Frequency") +
theme_bw()
plotly::ggplotly(plot8)